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Abstract 

We discuss a new method for setting limits on small signals in the presence of back- 
ground noise. The method is based on a combination of a two dimensional confidence 
region and the large sample approximation to the likelihood ratio test statistic. It 
automatically quotes upper limits for small signals and two-sided confidence inter- 
vals for larger samples. We show that this method gives the correct coverage and 
also has good power. 
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1 Introduction 



Finding confidence intervals or upper limits for small signals has recently at- 
tracted a great deal of attention. The paper by Feldman and Cousins U 
rekindled the interest in this area by providing a unified approach, that is, it 
based the choice of quoting an upper limit or a two-sided confidence interval 
on the data alone, without the experimenter having to make this decision. 
The unified approach uses the Neyman construction together with an order- 
ing principle based on likelihood ratios. Subsequent papers such as Giunti 
and Roe and Woodroofe gave variations of this method, basing the order- 
ing on other quantities. Common to all these methods is the need to have 
a fairly precise knowledge of the background rate, for example, from Monte 
Carlo simulations. Unfortunately, as we will see in section 4, these methods 
can fail when used in the presence of background uncertainty. A possible rem- 
edy is discussed in Cousins and Highland M where it is suggested to treat 
the background uncertainty as a systematic error. We will instead treat the 
background uncertainty as a statistical error and develop a method that is 
suitable in this situation. Our method is based on the likelihood ratio test 
statistic, together with an adjustment for those situations where there is very 
little (or no) signal observed. We will show that this method yields the correct 
coverage rate and that it has good power. 

The background rate has to be estimated either from the data or through 
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Monte Carlo. Both of those methods have their strengths and their weaknesses. 
Using data requires choosing sidebands, and implicitly makes the assumption 
that the density generating the background events is the same in the signal 
region as it is in the sidebands. This leads to a quandary: If we choose a small 
sideband, this assumption seems more reasonable, but then we will also see 
fewer background events and therefore have a higher statistical uncertainty in 
the estimate of the background rate. Choosing a large region might yield higher 
statistics but makes the assumption of a linear background more tenuous. 
An alternative way to estimate the background rate is by Monte Carlo. One 
problem with this approach is that a good Monte Carlo is often difficult to do 
because we can never be completely sure that we have modeled all the relevant 
effects correctly. This is particularly true when searching for small signals since 
the efforts to reduce the background often mean one is probing the tails of 
distributions which may be difficult to model. Also, in High Energy Physics 
today running a complete Monte Carlo simulation of an experiment can be 
a formidable task from a computational point of view, and it might not be 
possible to run enough Monte Carlo to effectively eliminate the uncertainty in 
the background estimate. For these reasons, it is in many cases not possible to 
ignore the uncertainty in the background rate. Our method is well equipped to 
deal with the uncertainty that comes from limited statistics in both situations, 
those where the background is estimated from the data as well as those where 
the background is estimated using Monte Carlo. 
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In some cases there are two different sources of background. It turns out that 
our method can be extended fairly easily to this situation also, regardless of 
the method of estimation used. 



2 A Description of the Method 

In this section we will outline the basic ideas of this new method. We will 
need the following notation. Assume that we observe x events in a suitably 
chosen signal region and a total of y events in the background region. Here 
the background region can be chosen fairly freely and need not be contiguous. 
Furthermore, we assume that the ratio of the size of the background region 
to the size of the signal region is r. For example, if we use two background 
regions of the same size as the signal region we get r = 2. Then a probability 
model for the data is given by 



where \i is the signal rate, b is the background rate and Pois is the usual 
Poisson distribution. We can assume X and Y to be independent and so 



X ~ Pois(fj J + b), 



Y ~ Pois{rb) 



P,, b (X = x,Y = y) 



x\ y\ 
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2.1 A Confidence Region for fi and b 



A common technique for constructing confidence regions (or intervals for one- 
dimensional parameters) is to find a corresponding hypothesis test and then 
to invert the test. We will start with a simultaneous hypothesis test for /i and 
b with the null hypothesis Hq : fi = fi , b = b$. The steps are as follows: 

(1) List all observations (ui,Vi),i = l,..,K, together with their probabilities, 
which are given by 



Here we use the values // and bo specified in the null hypothesis. We list 
only those observations which have a probability above a certain small 
threshold. In our algorithm we require the probabilities to be larger than 
lCT 6 . 

(2) Sort all observations from the most likely to the most unlikely. 

(3) Find the partial sums from the largest to the k th observation until you 
reach 1 — a, if the desired level of the test is a. 

(4) If the observed (x, y) appears in the list of possible observations before 
1 — a is reached, accept the null hypothesis, otherwise reject it. 

This is in effect a Neyman construction like the one used in Feldman and 
Cousins |[], only we use the probabilities as the ordering quantities. The test 
simply checks whether or not the observation (x, y) is compatible with the 
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rates /i and 60 specified in the null hypothesis. 

Using the likelihood ratios as the ordering quantity in a way similar to Feldman 
and Cousins |J would in principle be superior, but unfortunately does not yield 
a viable method here because the list of " likely" observations is infinite. 

The inversion of the hypothesis test then involves a search through all pairs 
(fi, b). If a certain pair leads to the acceptance of the null hypothesis, we add 
it to the confidence region, otherwise we do not. As an example we have figure 
1 where we show the confidence regions obtained for three observations. The 
boundaries of the confidence regions are somewhat ragged due to the discrete 
nature of the Poisson random variable. 



2.2 A Confidence Interval for fi 



Again we will start with a hypothesis test, but this time we will only fix the 
signal rate fx. The null hypothesis then becomes H : /i = /i . 

A popular test in Statistics for any kind of hypothesis test is the likelihood 
ratio test, which is based on the likelihood ratio test statistic A given in our 
problem by: 

max {l(fj, Q ,b\x,y) : b > 0} 
X,V) ~ max {l(fi, b- x, y) : fi > 0, b > 0} 

Here l(fi,b;x,y) = P^b(X = x,Y = y) is the likelihood function of ft and b 
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given the observation(x, y). This test statistic can be thought of as the ratio 
of the best explanation for the data if H is true and the best explanation 
for the data if no assumption is made on fi. The denominator is simply the 
likelihood function evaluated at the usual maximum likelihood estimator. To 
find the numerator we have to find the maximum likelihood estimator of the 
background rate b assuming that the signal rate is known to be /i . 



_bgi(^,&; irjy ) = _-l + -- T =0 

c )= x + y-(l + T)fi + yj(x + y-(l + r)/i ) 2 + 4(1 + t) Wq 

2(1 + r) 

b; x, y) is called the profile likelihood function of fi. The usefulness of the 
likelihood ratio test statistic lies in the fact that approximately we have 

-21ogA(/i ;x,y) ~ x 2 (d) 

that is —2 log A has an approximate Chi-Square distribution with d degrees of 
freedom, where d is the number of parameters in the model minus the number 
of parameters specified in the null hypothesis. Here we have d = 1. Because 
the profile likelihood I differs from the likelihood ratio A only by a constant 
independent of fi , we can concentrate on the profile likelihood. For more 
details on the likelihood ratio test statistic see Casella and Berger ||. For 
information on the profile likelihood see Bartlett ||, Lawley |7| and Murphy 
and Van Der Vaart ||. In figure 2 we have the profile likelihood function for 
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the case x = 6, y = 2, r = 2. To find a (1 — a) • 100% confidence interval 
we start at the minimum, which of course is at the usual maximum likelihood 
estimator, and then move to the left and to the right until the function rises 
by the a percentile of a x 2 distribution with 1 degree of freedom. 

The method here uses an approximation to the profile likelihood function by a 
quadratic function. Unfortunately, in cases where the number of observations 
in the signal region is small compared to the number of background events, 
the profile likelihood function becomes almost linear and this approximation 
does not work. For those cases we will use the following method: 

We overlay the confidence region described previously with the profile likeli- 
hood curve Then we find the smallest value of \x that is on this curve 
but not in the confidence region. Figure 3 illustrates this method. Clearly, 
in the case of fewer observations in the signal region than in the background 
region only an upper bound will be quoted. We will use this second method 
whenever the profile likelihood function has a positive derivative at /i — 0. 
It turns out that the limits obtained by these two methods are compatible. 
Figure 4 shows the upper bound for a number of cases with the limits obtained 
by the confidence region method drawn as squares and the limits using the 
likelihood ratio method as diamonds. The transition from one method to the 
other is quite smooth. 



9 



3 Extensions of the Method 



3. 1 Estimating background from Monte Carlo 

Our method can also be applied when a Monte Carlo with limited statistics 
is used to estimate the background rate. Assume we run the Monte Carlo n 
times and observe a total of y events. In the data we see x events in the signal 
region. Then a probability model for this situation is given by: 

X ~ Pois(/2 + b), Y ~ Pois{nb) 

We notice, of course, that this is actually the same model as the one previously 
used, only with an n instead of a r. Therefore, we can use our method for this 
situation without any changes. 

3.2 Include a second background source 

Sometimes there is a second source of background present in the data, this one 
characterized by the fact that it only appears in the signal region. An example 
is an invariant mass histogram where some of the background comes from the 
misidentification of pions with muons. Say we run a Monte Carlo n times and 
observe a total of z events of this type. In the data we have x observations in 
the signal region and y observations in a suitably chosen background region. 
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The probability model for this case is given by : 

X ~ Pois(/2 + b + r]),Y ~ Pois(rb) 
Z ~ Pois(nrj) 

where \i is the signal rate, b is the rate of the first background source and r\ is 
the rate of the second background source. Then: 

p»ax = *, y = v, z = z) = (M±^! e -(^,(lg e - rt M! e -^ 

We can extend our method to this situation in a fairly straightforward man- 
ner. First, we have to find the profile likelihood function, which leads to the 
following nonlinear system of equations: 

Slog/ x y 

— = ; 1 + - - r = 

ob fi + b + 1] b 

dlogl x z 

— = ; H n = 

or] /i + b + r] r] 

It turns out that fj can be found as the second largest root of the cubic equation 
ax 3 + bx 2 + cx + d = with 

a = (1 + n)(n — r) 

b—(x + z—(l + n)fx) (r - n) — (1 + n)(z + y) 
c — xz + (r — + z 2 + yz — (1 + 
d = fiz 2 
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and then b is given by 

t x + y-(l + T)(fi + 7j) + ^/(x + y-(l + T)(fi + r))) 2 + 4(1 + T )y(n + rj) 

b = ; : 

2(1 + r) 

The same problem as before arises again in this situation: in the case of few 
events in the signal region the profile likelihood function is nearly linear. We 
can use the same remedy as before by instead finding the intersection of the 
boundary of the three-dimensional confidence region in (/x, b, rj) space with the 
profile likelihood curve (ji, b, rj) . 

4 Performance of this Method 

In this section we will study the true coverage and the power of this method. 
For comparison we will use the unified approach of Feldman and Cousins 
0. Although that method was not designed to deal with uncertainty in the 
background, it is the standard method used in calculating confidence intervals 
in High Energy Physics at this time according to the Review of Particle Physics 
||. The coverage rates shown here were obtained through exact computation, 
without approximation. In figure 5 we have the case of one background region 
of equal size to the signal region, and finding 90% confidence intervals. The 
true background rates are 6 = 1,3 and 5 and the signal rates go from to 
5. Clearly our method has much better coverage than Feldman and Cousins 
|IJ; although, due to the approximation used in the method, the coverage is 
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sometimes slightly worse than the nominal one. The ragged appearance of 
the graph is due to the discrete nature of the Poisson distribution and in 
general is unavoidable. Figure 6 has the case of two background regions and a 
99% confidence interval. Again the new method performs very well, whereas 
Feldman and Cousins JI| does not achieve the nominal coverage. 

Next, we illustrate the performance of this method when the background rate 
is estimated by Monte Carlo. In figure 7 we compute the coverage rates for the 
cases fj, = and b = 0.5, 1.5 and 2.5. The Monte Carlo is run n times where 
we let n range from 1 to 10. Our method performs satisfactorily for all cases. 
This plot is also interesting because it gives some insight into the number of 
runs of the Monte Carlo needed before one can assume that the background 
rate is known. 

The performance of the extension of our method to the case of two differ- 
ent background sources discussed in section 3.2 has to be studied using mini 
Monte Carlo because the number of possible observations (x,y,z) becomes 
quite large. We have run a variety of these mini Monte Carlo studies. In Fig- 
ure 8 we show the results for the case r = 2, n = 10 and a 90% confidence 
interval. The true coverage rates of our method appear to be in line with 
the nominal ones, again with a few cases where the coverage is slightly worse 
due to the approximation used in the method. Due to the discreteness of the 
Poisson distribution the method is also quite conservative for many situations. 
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As Feldman and Cousins Jl| noted, there has been some criticism of their limits 
in cases where an experiment finds fewer events than are expected from the 
background rate. For example, an experiment with an expected background 
rate of b = 0.5 and no observed events would quote a Feldman- Cousins upper 
limit of 1.94; whereas an experiment with an expected background rate of 
b = 1.5 and no observed events would quote a Feldman-Cousins upper limit 
of 1.33. Our tables show a similar effect. Feldman and Cousins |l| explain this 
apparent inconsistency in some detail, and we are in full agreement with their 
reasoning. 

It may also come as somewhat of a surprise that our method yields limits that 
are sometimes smaller than the limits in Feldman and Cousins |]J . Contrary to 
the instance discussed in the previous paragraph, this happens in some cases 
where more events are observed in the signal region than in the sidebands. For 
example, consider the following situation: say we use one sideband of equal 
size to the signal region, that is we have r = 1, and we observe x — 1 events 
in the signal region and y = events in the sideband. Then the 90% upper 
limit using our method is found to be 3.65, whereas assuming 6 = and 
using Feldman and Cousins method gives a 90% upper limit of 4.36, or about 
19% larger. So here we get a smaller limit despite the fact that we have an 
additional uncertainty. This apparent paradox can be explained as follows: Let 
us consider the question whether the single event observed in the signal region 
is a signal or a background event. Using Feldman and Cousins ]I[ the answer 
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is clear: having assumed b = we know that there is no background, therefore 
this event has to be from the signal. In fact, it would be quite reasonable in 
this situation to actually quote a lower limit larger than 0, because we already 
have proof that fi > 0. On the other hand, in our method y = does not imply 
b = 0, it only means that b is not too large. It is therefore still possible that 
the event in the signal region is in fact a background event, and that = 0. 
It should come as no surprise, then, that in those cases we are quoting a 
smaller upper limit for /i than Feldman and Cousins J1J • Notice that in theory 
it would still be preferable to have absolute knowledge of the background rate 
because one could then make an actual observation of the signal rate rather 
than having to settle for setting a limit. In practice it is usually impossible to 
have such precise knowledge and, of course, nobody would claim a discovery 
based on just one event. 



5 Conclusion 

The methods for quoting limits for rare decays, mainly Feldman and Cousins 
ll and its variants, all suffer from the requirement that the background source 
be known with a high degree of precision. We have described a new method 
based on the likelihood ratio test which treats the background uncertainty as 
a statistical error. The performance of the method is shown to be quite good, 
with the true coverage rates close to the nominal ones and good power. The 
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method can be used in situations where the background rate has been esti- 
mated from data sidebands as well as where it has been obtained from Monte 
Carlo. The method can also be extended to cases where a second background 
source solely appearing in the signal region is present. 

In the appendix we provide tables for the confidence intervals and the exper- 
imental sensitivity as discussed in Feldman and Cousins [[!]] and in Review of 
Particle Physics || for the cases r = 1, 2 and a = 0.9, 0.99. The currently used 
algorithm for computing the limits becomes unreliable in some extreme cases, 
and in those cases we use NA in the tables. A FORTRAN program for the com- 
putation of the limits for any other case as well as for the extension discussed 
in section 4.2. can be obtained by writing to w_rolke@rumac.upr.clu.edu. 
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6 Appendix 




Fig. 1. Two dimensional confidence regions for x = 3, 4, 5 and 6 signal events, and 
y = 3 background events. The background region is twice the size of the signal 
region, and a 90% confidence level was used. 
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Fig. 2. Profile likelihood function with two-sided limits for the case of x = 6 events in 
the signal region and y = 2 events in the background region. The background region 
is twice the size of the signal region (r = 2). The nominal coverage probability is 
0.9. 
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x=0, y=1 



x=1, y=3 





Fig. 4. Upper limits obtained for a range of x and y values. Squares indicate that 
the confidence region method was used whereas diamonds indicate that the profile 
likelihood method was used. 
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Fig. 5. True coverage rates for our method and for the unified approach of Feldman 
and Cousins. The background rates are 6 = 1,3 and 5. The signal rates range from 
to 5. We have r = 1 and we use 90% confidence intervals. 
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b=l 



b=3 




Fig. 6. True coverage rates for our method and for the unified approach of Feldman 
and Cousins. The background rates are 6 = 1,3 and 5. The signal rates range from 
to 5. We have r = 2 and we use 99% confidence intervals. 
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b=2.5 




New Method 
Cousins-Feldman 



Fig. 7. Here the background is estimated with Monte Carlo. There is no signal 
present, and the Monte Carlo was run n times. 99% confidence intervals are used. 
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b=1, eta=1.3 



b=1, eta=0.5 




b=0.5, eta=1.3 
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Fig. 8. The case of two different background sources. Coverage rates are based on 
mini Monte Carlo. The simulation is for the case with a background region twice the 
size of the signal region. The Monte Carlo was done 10 times and a 90% confidence 
interval was found. 
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Table 1: 90% CI for Poisson signal \i with r = 1. y is the number of events 
observed in the background region and x is the number of events observed 
in the signal region. In this case the estimated background rate would be 
b = y- 
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Table 2: 90% CI for Poisson signal \i with r = 2. y is the number of events 
observed in the background region and x is the number of events observed 
in the signal region. In this case the estimated background rate would be 

b = y/2. 
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Table 3: 99% CI for Poisson signal \i with r = 1. y is the number of events 
observed in the background region and x is the number of events observed 
in the signal region. In this case the estimated background rate would be 
b = y- 
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Table 4: 99% CI for Poisson signal \i with r = 2. y is the number of events 
observed in the background region and x is the number of events observed 
in the signal region. In this case the estimated background rate would be 

b = y/2. 
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Table 5: Experimental sensitivity (defined as the average upper limit that 
would be obtained by an ensemble of experiments with the observed number 
of events y in the sidebands and no true signal). Note that the estimated 
background rate would be b = y/r. 
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